36 research outputs found

    Cost estimation of spatial join in spatialhadoop

    Spatial join is an important operation in geo-spatial applications, since it is frequently used for performing data analysis involving geographical information. Much effort has been devoted in the past decades to providing efficient algorithms for spatial join, and this becomes particularly important as the amount of spatial data to be processed increases. In recent years, the MapReduce approach has become a de facto standard for processing large amounts of data (big data), and some attempts have been made to extend existing frameworks for the processing of spatial data. In this context, several different MapReduce implementations of spatial join have been defined, which mainly differ in the use of a spatial index and in the way this index is built and used. In general, none of these algorithms can be considered better than the others, but the choice might depend on the characteristics of the involved datasets. The aim of this work is to analyse them in depth and define a cost model for ranking them based on the characteristics of the dataset at hand (i.e., selectivity or spatial properties). This cost model has been extensively tested w.r.t. a set of synthetic datasets in order to prove its effectiveness.

    Fluorescence Spectrometric Determination of Drugs Containing α-Methylene Sulfone/Sulfonamide Functional Groups Using N1-Methylnicotinamide Chloride as a Fluorogenic Agent

    A simple spectrofluorometric method has been developed, adapted, and validated for the quantitative estimation of drugs containing α-methylene sulfone/sulfonamide functional groups using N1-methylnicotinamide chloride (NMNCl) as fluorogenic agent. The proposed method has been applied successfully to the determination of methyl sulfonyl methane (MSM) (1), tinidazole (2), rofecoxib (3), and nimesulide (4) in pure forms, laboratory-prepared mixtures, pharmaceutical dosage forms, spiked human plasma samples, and in volunteer's blood. The method showed linearity over concentration ranging from 1 to 150 µg/mL, 10 to 1000 ng/mL, 1 to 1800 ng/mL, and 30 to 2100 ng/mL for standard solutions of 1, 2, 3, and 4, respectively, and over concentration ranging from 5 to 150 µg/mL, 10 to 1000 ng/mL, 10 to 1700 ng/mL, and 30 to 2350 ng/mL in spiked human plasma samples of 1, 2, 3, and 4, respectively. The method showed good accuracy, specificity, and precision in both laboratory-prepared mixtures and in spiked human plasma samples. The proposed method is simple, does not need sophisticated instruments, and is suitable for quality control application, bioavailability, and bioequivalency studies. Besides, its detection limits are comparable to other sophisticated chromatographic methods.

    A cost model for spatial join operations in SpatialHadoop

    Spatial join is an important operation in geo-spatial applications, since it is frequently used for performing data analysis involving geographical information. Much effort has been devoted in the past decades to providing efficient algorithms for spatial join, and this is particularly important as the amount of spatial data to be processed increases. In recent years, the MapReduce approach has become a de facto standard for processing large amounts of data (big data), and some attempts have been made to extend existing frameworks for the processing of spatial data. In this context, SpatialHadoop is an extension of Apache Hadoop which includes native support for spatial data, in terms of spatial data types, operations and indexes. In particular, it provides five different variants of spatial join, which mainly differ in the use of a spatial index and in the way this index is built and used. In general, none of these algorithms can be considered better than the others, but the choice might depend on the characteristics of the involved datasets. The aim of this work is to analyze the characteristics of these algorithms in depth and to define a cost model for them which is based on some dataset characteristics (i.e., selectivity or spatial properties). The main goal of the proposed cost model is to rank the spatial join implementations by defining a partial order among them using a dominance relation. This cost model has been extensively tested w.r.t. a set of synthetic datasets in order to prove its effectiveness.

    A Comparison of Distributed Spatial Data Management Systems for Processing Distance Join Queries

    Due to the ubiquitous use of spatial data applications and the large amounts of spatial data that these applications generate, the processing of large-scale distance joins in distributed systems is becoming increasingly popular. Two of the most studied distance join queries are the K Closest Pair Query (KCPQ) and the Δ Distance Join Query (ΔDJQ). The KCPQ finds the K closest pairs of points from two datasets, and the ΔDJQ finds all the possible pairs of points from two datasets that are within a distance threshold Δ of each other. Distributed cluster-based computing systems can be classified into Hadoop-based and Spark-based systems. Based on this classification, in this paper, we compare two of the most current and leading distributed spatial data management systems, namely SpatialHadoop and LocationSpark, by evaluating the performance of existing and newly proposed parallel and distributed distance join query algorithms in different situations with big real-world datasets. As a general conclusion, while SpatialHadoop is a more mature and robust system, LocationSpark is the winner with respect to the total execution time.
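    The two query definitions in this abstract can be illustrated with a minimal brute-force sketch. It is not the distributed algorithm evaluated in the paper; it only restates the definitions for small in-memory point lists. The function names `k_closest_pairs` and `distance_join`, the parameter `delta`, and the choice of Euclidean distance are assumptions made for illustration.

```python
import heapq
from itertools import product
from math import dist  # Euclidean distance, Python 3.8+


def k_closest_pairs(P, Q, k):
    """KCPQ by brute force: the k pairs (p, q) with the smallest distance."""
    # Enumerate every pair from P x Q and keep the k with the smallest distance.
    return heapq.nsmallest(k, product(P, Q), key=lambda pq: dist(pq[0], pq[1]))


def distance_join(P, Q, delta):
    """Distance join by brute force: all pairs within distance delta."""
    # Assumption: "within" is read as less than or equal to the threshold.
    return [(p, q) for p, q in product(P, Q) if dist(p, q) <= delta]
```

    Both functions examine all |P| x |Q| pairs, which is the cost that spatial indexing and partitioning in systems such as SpatialHadoop and LocationSpark aim to avoid.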

    RkNN Query Processing in Distributed Spatial Infrastructures: A Performance Study

    The Reverse k-Nearest Neighbor (RkNN) problem, i.e. finding all objects in a dataset that have a given query point among their corresponding k-nearest neighbors, has received increasing attention in the past years. RkNN queries are of particular interest in a wide range of applications such as decision support systems, resource allocation, profile-based marketing, location-based services, etc. With the current increasing volume of spatial data, it is difficult to perform RkNN queries efficiently in spatial data-intensive applications, because of the limited computational capability and storage resources. In this paper, we investigate how to design and implement distributed RkNN query algorithms using shared-nothing spatial cloud infrastructures such as SpatialHadoop and LocationSpark. SpatialHadoop is a framework that inherently supports spatial indexing on top of Hadoop to perform spatial queries efficiently. LocationSpark is a recent spatial data processing system built on top of Spark. We have evaluated the performance of the distributed RkNN query algorithms on both SpatialHadoop and LocationSpark with big real-world datasets. The experiments have demonstrated the efficiency and scalability of our proposal in both distributed spatial data management systems, showing the performance advantages of LocationSpark.
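    The RkNN definition given in this abstract can be restated as a small brute-force sketch. It is not the distributed algorithm studied in the paper. The function name `rknn`, the use of Euclidean distance, and the tie-handling rule (a point at equal distance does not push the query out of the neighbor set) are assumptions made for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def rknn(data, q, k):
    """Return the objects in data that have q among their k nearest neighbors."""
    result = []
    for i, o in enumerate(data):
        d_q = dist(o, q)
        # Count the other dataset objects that are strictly closer to o than q is.
        closer = sum(
            1 for j, p in enumerate(data) if j != i and dist(o, p) < d_q
        )
        # q ranks among the k nearest neighbors of o if fewer than k are closer.
        if closer < k:
            result.append(o)
    return result
```

    This version compares every object with every other object, so its cost grows quadratically with the dataset size, which is what motivates index-based and distributed approaches.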

    Cost estimation of spatial join in spatialhadoop

    No full text

    Using Deep Learning for Big Spatial Data Partitioning

    No full text

    Spatial partitioning techniques in SpatialHadoop

    No full text

    Molecular characterization of ochratoxigenic fungi associated with poultry feedstuffs in Saudi Arabia

    No full text
    Fungal and mycotoxin contamination of food and poultry feeds can occur at each step along the chain from grain production, storage, and processing. A total of 200 samples comprising mixed poultry feedstuffs (n = 100) and their ingredients (n = 100) were collected from Riyadh, Alhassa, Qassium, and Jeddah cities in Saudi Arabia. These samples were screened for contamination by fungi. Penicillium chrysogenum was the predominant species in terms of count and frequency, respectively, in both mixed poultry feedstuff and barley samples (4,561.9 and 687 fungal colony-forming units (CFU)/g) and (66% and 17%). Moisture content was an important indicator for the count of fungi and ochratoxin A. Ochratoxin analysis of plate cultures was performed by an HPLC technique. The sample of mixed poultry feedstuff collected from Jeddah displayed the highest level of ochratoxin (14.8 µg/kg) and moisture content (11.5%). Corn grain samples were highly contaminated by ochratoxin A (450 and 423 µg/kg) and recorded the highest moisture contents (14.1 and 14.5%). Ochratoxin A production in fungal species isolated from mixed poultry feedstuff samples was high with P. verrucosum (5.5 µg/kg) and A. niger (1.1 µg/kg). In sorghum and corn grains, the highest ochratoxin-producing species were P. viridicatum (5.9 µg/kg) and A. niger (1.3 µg/kg), respectively. Sixty-three isolates of A. niger were ochratoxigenic, and all of them showed the presence of pks genes using PKS15C-MeT and PKS15KS primer pairs. The detection technique of A. niger in poultry feedstuff samples described in the present study was successfully used as a rapid and specific protocol for early detection of A. niger without cultivation on specific media.